Abstract

A method designed to assist practitioners in the interpretation of the practical significance of a 
statistically significant logistic regression coefficient is presented. To avoid the interpretation 
problems encountered when using the traditionally reported change in either the log odds or odds 
values, this method centers the interpretation on the change in the probability value of the event 
occurring associated with a given change in the independent variable for a range of initial 
probability values. To assist in the implementation of this method, a computer program, which is 
used in conjunction with the Microsoft® Excel software, was designed to compute these 
probability values. 
Expressing Logistic Regression Coefficients as
Changes in Initial Probability Values: Useful Information for Practitioners

It is not uncommon for program evaluators and researchers to encounter a variety of 
situations in which the dependent variable of interest is dichotomous (i.e., the variable consists of 
two categories). For example, a study by Graham (2002) investigated whether specific teacher 
background variables could discriminate between groups of teachers who were classified as 
belonging to either the early or late stages of concern regarding the implementation of an 
educational change in their schools. Schreiber (2002) attempted to determine if certain student 
background and academic factors could be used to identify whether the students scored above or 
below the international mean on an advanced mathematical examination. In a study by 
McCoach and Siegle (2001), attitude variables were analyzed to determine if they could 
accurately identify whether a student was a gifted achiever or a gifted underachiever. Finally, 
Hendel (2001) examined whether students' participation in a first-year seminar and various 
academic and background variables were associated with whether they returned for 
their second year of college. 

When such dependent variables are encountered, researchers and program evaluators often 
use logistic regression analysis, as was the case for the four previously cited studies. One 
problem confronting these researchers and program evaluators is how to convey the meaning 
of a significant logistic coefficient in a manner that practitioners find meaningful and useful 
(Cabrera, 1994). 

Various researchers have stressed the importance of not only reporting whether 
parameter estimates, such as logistic regression coefficients, are statistically significant but also 
the need to provide information that can be used to judge the practical significance of these 
parameter estimates (Fisher, 1925; Cohen, 1969, 1988; Fraas & Newman, 2000; Levin & 
Robinson, 2000; Robinson & Levin, 1997; Thompson, 1996). Since a logistic regression 
coefficient measures the change in the log of the odds of the event occurring associated with a 
one-unit change in the independent variable, many practitioners may find such a value difficult 
to use when judging the practical significance of the change. Thus the question addressed in 
this paper is: What information regarding statistically significant logistic regression coefficients 
can researchers provide to practitioners that will allow them to best deal with the concept of 
practical significance? 

This paper presents a method of expressing a logistic coefficient as a series of probability 
values where each value represents the change in a person’s probability of belonging to the group 
assigned the value of one in the dependent variable (e.g., the teachers who were classified as 
belonging to the late-stages-of-concern group in the Graham [2002] study) associated with a 
one-unit change in the independent variable. We believe these probability values will provide 
researchers, program evaluators, and practitioners with information that will assist them in 
judging the practical significance of a change in the dependent variable that is associated with a 
given change in the independent variable as measured by the coefficient. 

In addition to presenting this method of expressing a logistic coefficient as a series of 
probability values, this paper describes how a researcher or a practitioner can use a computer 
program, which is used in conjunction with the Microsoft® Excel software, to generate these 
probability values. A researcher or practitioner who wants to use the program can enter its 
commands (see Appendix A) into the Microsoft® Excel software or can request a copy of 
the program from the authors by sending a request to jfraas@ashland.edu. 






Interpreting Logistic Regression Coefficients 

In a logistic regression model, the relationship between the binary dependent variable and 
a given independent variable is assumed to follow a logistic function (Collett, 1991). The 
logistic function takes the following form: 



$$\ln\left(\frac{\text{Prob(event)}}{1 - \text{Prob(event)}}\right) = B_0 + B_1X_1 + B_2X_2 + \dots + B_pX_p$$



where: 



1. Prob(event) refers to the probability that the person will belong to the group assigned 
a value of one in the original dependent variable. 

2. The symbols $X_1$, $X_2$, and $X_p$ represent the independent variables. 

3. The symbols $B_0$, $B_1$, $B_2$, and $B_p$ refer to the intercept and coefficients. 

To estimate the logistic regression model, the logistic curve is fitted to the actual data. In the 
process of fitting the logistic curve to the data the values of the dependent variable are 
transformed to the logarithm of the odds (i.e., the log odds) that a person will belong to the group 
assigned a value of one in the original dependent variable. As noted by Cizek and Fitzgerald 
(1999), this transformation results in log odds values that are measured on an equal interval 
scale. Often these log odds values are referred to as logits, which is a contraction of the terms 
logistic and units. 
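
The transformation between probabilities and log odds described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the Excel program; the function names `logit` and `inverse_logit` are hypothetical.

```python
import math

def logit(p: float) -> float:
    """Return the log odds (logit) of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inverse_logit(log_odds: float) -> float:
    """Convert a log odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Probabilities that are equally spaced are not equally spaced on the logit scale.
for p in (0.10, 0.50, 0.90):
    print(f"p = {p:.2f}  logit = {logit(p):+.3f}")
```

A probability of .50 corresponds to a logit of zero, and probabilities that are symmetric around .50 have logits of equal size and opposite sign.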

The logistic transformation of the dependent variable causes the coefficients estimated 
for a logistic regression model to differ from those obtained from an Ordinary Least Squares 
(OLS) regression model. As noted by Cabrera (1994), "unlike OLS, the metric of the individual 
coefficients under logistic regression is expressed in terms of logits rather than in terms of the 
original scale of measurement" (p. 245). Thus the logistic coefficient for a given independent 
variable is interpreted as the change in the log odds associated with a one-unit change in that 
variable. 

To illustrate the interpretation of a logistic coefficient, consider the results of a study 
conducted by Graham (2002). In Graham's study, the dependent variable identified whether 
teachers were considered to be in the early or late stages of concern regarding the 
implementation of educational innovation in their schools. Teachers in the early-stages-of-concern 
group were assigned values of zero, while the teachers in the late-stages-of-concern 
group were assigned values of one. The logistic regression coefficient for the independent 
variable that represented the number of years the teachers worked with their current principals 
was statistically significant. The value of this coefficient, which was .161, indicates that an 
additional year of working with the current principal was associated with an increase of .161 in 
the log odds of belonging to the late-stages-of-concern group. 

Practitioners, policy makers, and researchers may find that such an interpretation of a 
logistic regression coefficient does not provide information that readily lends itself to judging the 
practical significance of the coefficient. We believe it is important for researchers who use 
logistic regression to express the change in the dependent variable associated with a change in an 
independent variable in a manner that practitioners and policy makers can use to judge the 
educational or practical significance of the change. 

Methods Commonly Used to Interpret Logistic Regression Coefficients 

Two commonly used methods of interpreting statistically significant logistic regression 
coefficients are (a) the odds ratio value and (b) the Delta-p statistic. With regard to the use of the odds ratio value, Cizek and 
Fitzgerald (1999) stated that: "Information related to the odds— as opposed to log odds— of an 
event occurring is easier to understand and communicate" (p. 230). Commenting on the use of 
the Delta-p statistic in conjunction with a logistic regression coefficient, Petersen (1985) 
suggested that it is a suitable method of estimating the change in the dependent variable 
associated with the overall change in the independent variable. 

The odds ratio value method. To illustrate how a researcher could communicate the 
meaning of a logistic regression coefficient by using the odds ratio value, consider the logistic 
regression coefficient for the variable representing the years of working with the principal in the 
Graham (2002) study, which was .161. Raising e, which is the base of the natural logarithm, to 
the power of .161 produces an odds ratio value of 1.17. This value indicates the factor by which 
the odds change when the years of working with the principal increase by one year. Thus an 
increase of one unit in the independent variable, which is one additional year of working with the 
principal, increases the odds of belonging to the late-stage-of-concern group (i.e., the group 
assigned a value of one in the dependent variable) by a factor of 1.17. 
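
The odds ratio calculation above amounts to a single exponentiation. The following minimal Python sketch reproduces it; the function name `odds_ratio` and its optional `change` argument (which assumes that a change of several units multiplies the coefficient) are illustrative choices rather than features of the program described in this paper.

```python
import math

def odds_ratio(coefficient: float, change: float = 1.0) -> float:
    """Factor by which the odds are multiplied for `change` units of the predictor."""
    return math.exp(coefficient * change)

b_years = 0.161  # coefficient for years working with the principal (Graham, 2002)
print(f"{odds_ratio(b_years):.2f}")  # prints 1.17
```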

Delta-p statistic method. The Delta-p statistic estimates the change in the probability of 
a person belonging to the group assigned a value of one in the dependent variable that is 
associated with a one-unit change in a given independent variable. It is important to note that 
this change is calculated for only one initial probability value. The initial probability value used 
is set equal to the mean of the dependent variable, which represents the proportion of the sample 
belonging to the group assigned a value of one in the dependent variable. The calculation of the 
Delta-p value is based on the following equation: 

$$P_c = \frac{e^{\ln(P_i/(1-P_i))+b}}{1+e^{\ln(P_i/(1-P_i))+b}} - P_i$$

where: 

1. $P_c$ represents the change in the probability value. 

2. $P_i$ represents the initial probability value. This value is set equal to the proportion of 
the sample belonging to the group assigned the value of one in the dependent variable 
(i.e., the sample mean of the dependent variable). 

3. The symbol $e$ represents the base of the natural logarithm. 

4. The symbol $b$ represents the logistic regression coefficient for the given predictor 
variable. 

The interpretation of the Delta-p statistic, which is calculated with the initial probability set equal to 
the mean of the dependent variable, depends on whether the independent variable is 
continuous or categorical. If the independent variable consists of continuous values, the Delta-p 
statistic indicates the change in the initial probability of belonging to the group assigned a value 
of one in the dependent variable associated with a one-unit increase in the independent variable. 
If the independent variable represents categories, the Delta-p statistic indicates the change in the 
initial probability of belonging to the group assigned a value of one in the dependent variable if 
the person is a member of the group assigned a value of one rather than a member of the group 
assigned a value of zero in the independent variable. 

To illustrate the calculation and interpretation of the Delta-p statistic, consider the results 
produced by the Graham (2002) study in which the mean of the dependent variable was .41 and 
the logistic coefficient for the continuous independent variable of years working with the 
principal was .161. The Delta-p value would be calculated as follows: 
$$P_c = \frac{e^{\ln(.41/(1-.41))+.161}}{1+e^{\ln(.41/(1-.41))+.161}} - .41$$

$$P_c = .45 - .41$$

$$P_c = .04$$

This value indicates that for an additional year of working with the principal the logistic model 
predicts an increase of .04 in the initial probability of belonging to the late-stage-of-concern 
group, which was .41. 

A Delta-p statistic can also be calculated for an independent variable that consists of 
categories. To illustrate the calculation and interpretation of the Delta-p value for a categorical 
independent variable, consider once again the results produced by the Graham (2002) study, in 
which the categorical independent variable of tenure used values of zero and one to represent 
teachers without and with tenure, respectively. Since the logistic coefficient for this tenure 
variable was -.32, the Delta-p value would be calculated as follows: 

$$P_c = \frac{e^{\ln(.41/(1-.41))-.32}}{1+e^{\ln(.41/(1-.41))-.32}} - .41$$

$$P_c = .34 - .41$$

$$P_c = -.07$$

This value indicates that for the teachers with tenure, the logistic model predicts a decrease of 
.07 in the initial probability of belonging to the late-stage-of-concern group when compared to 
teachers without tenure. 
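
Both Delta-p calculations above can be checked with a short Python sketch of the Delta-p equation. The function name `delta_p` is hypothetical, and the inputs are the values reported for the Graham (2002) study.

```python
import math

def delta_p(initial_p: float, coefficient: float) -> float:
    """Change in probability for a one-unit change in the predictor,
    starting from initial_p (a direct transcription of the Delta-p equation)."""
    shifted = math.log(initial_p / (1.0 - initial_p)) + coefficient
    final_p = math.exp(shifted) / (1.0 + math.exp(shifted))
    return final_p - initial_p

mean_dv = 0.41  # proportion of teachers in the group assigned a value of one

print(f"{delta_p(mean_dv, 0.161):.2f}")   # years with principal: prints 0.04
print(f"{delta_p(mean_dv, -0.32):.2f}")   # tenure (0/1): prints -0.07
```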

As previously demonstrated, researchers can calculate a Delta-p statistic for both 
continuous and categorical independent variables. It should be noted, however, that since there 
are no known procedures to estimate the statistical significance of a Delta-p statistic, St. John 
(1991) and Cabrera (1994) recommend that Delta-p statistics be calculated and interpreted only 
for statistically significant logistic coefficients. It should be mentioned that the logistic 
regression coefficient for the tenure variable in the Graham (2002) study was not statistically 
significant. The Delta-p statistic was calculated for the tenure variable only for the purpose of 
providing an illustration of how such a value is calculated and interpreted for a categorical 
independent variable. 

A Method Designed to Generate a Series of Changes in the Initial Probability Values 

Although both the odds ratio value and the Delta-p statistic methods of reporting and 
interpreting statistically significant logistic regression coefficients provide information that is 
somewhat more useful than that provided by the log odds values, we believe that even more 
useful information can be provided to practitioners by expanding on the concept incorporated 
into the Delta-p statistic method. We believe the best way to report coefficients in order for 
researchers and practitioners to deal with practical significance is in the form of the change in the 
probability of belonging to the group assigned one in the dependent variable, which is the basis 
of the Delta-p statistic method. However, unlike the Delta-p statistic method, we believe the 
change in the initial probability should be calculated for a range of initial probabilities rather 
than just one such value. 

The reason we believe the concept of the Delta-p statistic needs to be expanded centers 
on the fact that in logistic regression the probability of the event occurring is not linearly related 
to the independent variable even though the log odds are linearly related to it. Thus the change in the 
probability of belonging to the group assigned a value of one in the dependent variable is not 
constant for various initial probability values. Since the Delta-p statistic is calculated for only 
one initial probability value, it does not allow a practitioner to assess the degree of variability of 
the changes in the initial probability associated with a specified change in the independent 
variable for a relevant range of initial probability values. That is, a practitioner may find that the 
changes in the initial probabilities are practically significant for those people with certain initial 
probabilities but not for those with other initial probability levels. 
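
The dependence of the change on the initial probability can be made explicit with a little algebra. The simplification below is a sketch derived from the Delta-p equation, using $P_i$ for the initial probability, $b$ for the coefficient, and $P_f$ (a symbol assumed here) for the probability after the change; it relies only on the identity $e^{\ln(P_i/(1-P_i))} = P_i/(1-P_i)$.

```latex
\begin{align*}
P_f &= \frac{e^{\ln(P_i/(1-P_i))+b}}{1+e^{\ln(P_i/(1-P_i))+b}}
     = \frac{P_i\,e^{b}}{1 - P_i + P_i\,e^{b}}, \\[4pt]
P_f - P_i &= \frac{P_i\,(1-P_i)\,(e^{b}-1)}{1 - P_i + P_i\,e^{b}}.
\end{align*}
```

The factor $P_i(1-P_i)$ shrinks toward zero as the initial probability approaches 0 or 1, so the same coefficient produces small changes at extreme initial probabilities and larger changes near the middle of the scale.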

We believe the method that provides the most useful information in terms of allowing 
practitioners to judge the practical significance of a logistic regression coefficient is one that 
expands the Delta-p statistic method by including various initial probability values. The method 
we are proposing, along with our computer program, calculates the change in the probability 
value of belonging to the group assigned a value of one in the dependent variable for a specified 
change in an independent variable for a range of initial probability values. We believe that the 
changes in the initial probability values calculated through this method will provide practitioners 
with a more complete picture of these changes, which will allow them to better deal with the 
issue of practical significance. 

Values Generated by the Proposed Method 

The method we are proposing produces one key series of probability values and three 
important individual probability values: (a) a series of changes in the probability of the event 
occurring for a given change in the independent variable that corresponds to a series of initial 
probability values, (b) the Delta-p statistic, (c) the initial probability value at which the absolute 
value of the change in the probability of the event occurring is maximized for a given change in 
the independent variable, and (d) the change in the probability of the event occurring for a given 
change in the independent variable for which the absolute value of this change is maximized. 
We believe using these values will better enable practitioners to accomplish the task of gauging 
the practical significance of the logistic regression coefficient. 

A series of changes in the probability of an event occurring. A series of probability 
values is calculated in which each value measures the change in the probability of belonging to 
the group assigned a value of one in the dependent variable for a given change in an independent 
variable at a specified initial probability level. One issue that must be addressed regarding the 
calculation of these changes in the initial probability values is: What initial probability values 
should be used? We recommend that the range of the initial probabilities be determined by the 
minimum and maximum probabilities predicted by the logistic regression model for the sample. 

Once the minimum and maximum initial probability values are determined, an interval 
must be specified to generate a series of initial probabilities. We suggest that the use of a .05 
interval generally will provide enough information to assess the practical significance of the 
coefficient. Although the program is designed to handle intervals as small as .01, we believe one 
should generally use an interval of .05 or .10. The number of probability values generated by 
intervals smaller than .05 can be overwhelming, while the number of probability values 
generated by intervals larger than .10 may provide too little information. 

After the series of initial probabilities has been generated, the final probability ($P_f$) of 
belonging to the group assigned a value of one in the dependent variable for a specified change 
in the independent variable is calculated for each initial probability by using the following 
equation: 

$$P_f = \frac{e^{\ln(P_i/(1-P_i))+b}}{1+e^{\ln(P_i/(1-P_i))+b}}$$

Once these final probability values are calculated, a series of corresponding changes in the 
probability values is calculated by subtracting each initial probability from its corresponding final 
probability. 
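
The steps just described (build a grid of initial probabilities, compute each final probability, and subtract) can be sketched in Python as follows. This is an illustrative stand-in for the Excel program, not a transcription of it; the function names are hypothetical, and the inputs are the values used for the Graham (2002) example in this paper (coefficient of .161, initial probabilities from .20 to .70, interval of .05).

```python
import math

def final_probability(initial_p: float, coefficient: float) -> float:
    """Apply the final-probability equation: shift the log odds by the coefficient."""
    shifted = math.log(initial_p / (1.0 - initial_p)) + coefficient
    return math.exp(shifted) / (1.0 + math.exp(shifted))

def probability_series(coefficient, p_min, p_max, interval):
    """Return (initial, final, change) rows from p_min up to p_max."""
    # Count whole intervals once; the small tolerance guards against
    # floating-point error such as (0.70 - 0.20) / 0.05 evaluating just below 10.
    steps = math.floor((p_max - p_min) / interval + 1e-9)
    rows = []
    for k in range(steps + 1):
        p_i = p_min + k * interval
        p_f = final_probability(p_i, coefficient)
        rows.append((p_i, p_f, p_f - p_i))
    return rows

for p_i, p_f, change in probability_series(0.161, 0.20, 0.70, 0.05):
    print(f"{p_i:.2f}  {p_f:.3f}  {change:+.3f}")
```

With these inputs the sketch produces eleven rows, and the changes run from about .027 (at an initial probability of .20) to about .040 (near .50).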

As is the case for the Delta-p statistic, the interpretation of these changes in the 
probability values depends on whether the independent variable is continuous or categorical. If 
the independent variable consists of continuous values, these values indicate the changes in the 
probability of belonging to the group assigned a value of one in the dependent variable 
associated with the specified increase in the independent variable. If the independent variable 
represents categories, these values indicate the change in the initial probability of belonging to 
the group assigned a value of one in the dependent variable if the person is a member of the 
group assigned a value of one rather than a member of the group assigned a value of zero in the 
independent variable. 

An examination of these changes in the probabilities that the event will occur will assist 
practitioners in judging the practical significance of the logistic regression coefficient in two 
ways. First, the practitioners can readily gauge the change in the dependent variable associated 
with a given change in the independent variable in meaningful terms (i.e., the change in the 
probability of the event occurring). The practitioners can judge whether an increase of, say, .10 
in the probability that the event will occur for a given change in the independent variable is large 
enough to have educational importance (i.e., it is practically significant). Second, an 
examination of the series of changes in the probability values will reveal the degree of variability 
in these values across the range of initial probability values. Such an examination will allow the 
practitioners to determine whether the changes in the probability that the event will occur are 
practically significant across the entire range of initial probabilities or only for a specific 
subgroup of initial probability values. 

Individual probability values. In addition to the series of changes in the probability of the 
event occurring for a given change in the independent variable, practitioners may find three 
individual values useful in judging the practical significance of a logistic regression coefficient. 
The first of these values is the Delta-p statistic. As previously described, the Delta-p statistic 
indicates the change in the initial probability value, which is set equal to the mean of the 
dependent variable, associated with a given change in the independent variable. This Delta-p 
statistic may be useful to practitioners in judging the practical significance of a logistic 
regression coefficient when the variability of the changes in the probability of the event 
occurring is minimal across the relevant range of initial probability values. In such a case, the 
Delta-p statistic, which is one value, may suffice as a means of judging the practical significance 
of the coefficient. 

The second individual value is the initial probability at which the largest absolute change 
in the probability associated with a specified change in the independent variable occurred. To 
identify this initial probability value, the numerator of the first derivative with respect to the 
initial probability of the following expression is set equal to zero. 

$$\frac{e^{\ln(P_i/(1-P_i))+b}}{1+e^{\ln(P_i/(1-P_i))+b}} - P_i$$

This procedure produced the following quadratic expression. 

$$(1 - e^{b})P_i^{2} - 2P_i + 1 = 0$$

The initial probability value at which the absolute change in the probability is maximized is 
calculated by substituting the value of the logistic regression coefficient into the following 
portion of the quadratic expression: 

$$P_i = \frac{1 - \sqrt{e^{b}}}{1 - e^{b}}$$

The third individual value is the actual value (i.e., the value with the sign included) for this 
maximized absolute value of the change in the initial probability. 

The actual value of the maximum absolute value of the change in the probability and its 
corresponding initial probability may provide useful information to practitioners in two ways. 
First, if the actual value of the largest absolute change in the initial probability does not reach the 
level deemed practically significant by the practitioners, they will quickly know that practical 
significance is not reached at any point within the range of relevant initial probability values. 
Second, practitioners can identify whether the initial probability at which the absolute value of 
the change is maximized is located at a point on the initial probability scale that is common or 
uncommon for the subjects included in the study. 
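
The second and third individual values can be computed directly, as in the following Python sketch. It assumes that the quantity being maximized is the final probability minus the initial probability, in which case setting the derivative to zero gives an initial probability of 1/(1 + e^(b/2)); this closed form is an algebraic simplification adopted here for illustration and is not necessarily the expression used in the Excel program. The function name `max_change_point` is hypothetical.

```python
import math

def max_change_point(coefficient: float) -> tuple[float, float]:
    """Return (initial probability, signed change) where |change| is largest."""
    root = math.exp(coefficient / 2.0)   # square root of e^b
    p_star = 1.0 / (1.0 + root)          # assumed closed form for the maximizing P_i
    odds_factor = root ** 2              # e^b
    final_p = p_star * odds_factor / (1.0 - p_star + p_star * odds_factor)
    return p_star, final_p - p_star

p_star, change = max_change_point(0.161)  # Graham (2002) coefficient
print(f"{p_star:.3f}  {change:.3f}")      # prints 0.480  0.040
```

A practitioner who has settled on a threshold for practical significance could compare it with the absolute value of the second number; if the threshold is not reached there, it is not reached at any initial probability.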

Computer Program Used to Generate Key Values 

The computer program we designed, which is used in conjunction with the Microsoft® 
Excel software, calculates and lists the following: (a) a series of initial probability values, (b) 
a series of corresponding final probability values associated with a given change in the 
independent variable, (c) a series of corresponding differences between the final and initial 
probability values, (d) the Delta-p statistic, (e) the initial probability value at which the change in 
the absolute value of the initial probability is maximized for a given change in the independent 
variable, and (f) the actual value of the maximum absolute change in the initial probability for a 
given change in the independent variable. In addition to generating these values, the program is 
designed to graph the curve depicting the relationship between the initial probability values (X 
axis) and the corresponding changes in the probability values (Y axis) for a given change in the 
independent variable. This visual presentation may assist the researcher in assessing the degree 
of change in the probability levels throughout the range of initial probability values. 

A description of the input values used in the program. The researcher must identify six 
key input values, which are entered into the first page of the computer program. To assist in 
describing these values and how they are entered into the program, a copy of the first page that 
contains the six key values used in the Graham (2002) study is presented in Figure 1. 



Insert Figure 1 about here 



The first two values entered into the program are: (a) the size of the change in the 
independent variable specified by the researcher and (b) the statistically significant logistic 
regression coefficient being evaluated. These values are entered into the rows entitled "Change 
in the Independent Variable" and "Coefficient Value," respectively. In the Graham (2002) study, 
values of 1 and .161 were entered for the change in the independent variable and the logistic 
regression coefficient, respectively (see Figure 1). 

The third value entered into the program is the proportion of subjects in the sample who 
belonged to the group assigned a value of one in the dependent variable. This value, which is 
equal to the mean of the dependent variable, is entered into the row entitled "Proportion of Cases 
with a Value of 1." The proportion of subjects in the sample used in the Graham (2002) study 
was .41 (see Figure 1). 

Since we are interested in generating a series of changes in the probability of the event 
occurring for a corresponding series of initial probability values, the researcher must identify a 
relevant range for these values. Thus the fourth and fifth numbers entered into the program are 
the minimum and maximum values of the initial probability values. These values are entered 
into the rows entitled "Minimum Value" and "Maximum Value." We suggest the minimum and 
maximum values be set approximately equal to the minimum and maximum probability values 
predicted by the logistic regression model for the subjects contained in the sample. The 
approximate minimum and maximum probability values for the Graham (2002) study were .20 
and .70, respectively (see Figure 1). 

The sixth value entered into the program is an interval that will be used to generate a 
series of initial probability values located between the minimum and maximum initial 
probability values, which were previously identified. The specified interval is placed in the 
row entitled "Interval." The interval was set at .05 in the Graham (2002) study (see Figure 1). 
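
For readers who want to mirror the six inputs outside of Excel, the following Python sketch collects them in one record and applies a few basic checks. The class name, field names, and validation rules are hypothetical, and the `log_odds_shift` method assumes that a change of k units in the independent variable shifts the log odds by k times the coefficient; the paper does not spell out how the program combines these two inputs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProgramInputs:
    """The six values entered on the first page (field names are hypothetical)."""
    change_in_x: float       # "Change in the Independent Variable"
    coefficient: float       # "Coefficient Value"
    proportion_ones: float   # "Proportion of Cases with a Value of 1"
    min_p: float             # "Minimum Value"
    max_p: float             # "Maximum Value"
    interval: float          # "Interval"

    def __post_init__(self) -> None:
        # Probabilities must lie strictly inside (0, 1) for the log odds to exist.
        if not 0.0 < self.min_p < self.max_p < 1.0:
            raise ValueError("need 0 < min_p < max_p < 1")
        if not 0.0 < self.proportion_ones < 1.0:
            raise ValueError("proportion_ones must lie strictly between 0 and 1")
        if self.interval <= 0.0:
            raise ValueError("interval must be positive")

    def log_odds_shift(self) -> float:
        """Assumed scaling: k units of change shift the log odds by k * coefficient."""
        return self.change_in_x * self.coefficient

# Values reported for the Graham (2002) example (see Figure 1).
graham = ProgramInputs(1, 0.161, 0.41, 0.20, 0.70, 0.05)
print(graham.log_odds_shift())
```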

A description of the output values produced by the program. Once the six input values 
are entered into the first page of the program, three series of probability values and three 
individual probability values are generated and listed on the second page of the program. To 
assist in describing these values, a copy of the second page of the program for the Graham 
(2002) study is contained in Figure 2. 

Insert Figure 2 about here 

The first of the three series of probability values consists of the initial probability values, 
which are listed in the program under the heading "Initial P Values." These values begin with 
the minimum initial probability value stipulated on the first page of the program and increase by 
the specified interval until the stipulated maximum initial probability is reached. In the Graham 
(2002) study these values began at the minimum initial probability value of .20 and increased 
by intervals of .05 until the maximum probability value of .70 was reached (see Figure 2). 

The second series of values contains the final probability values, which are calculated 
from the initial probability values. These final probability values, which indicate the probability 
of the event occurring once the independent variable was increased by the specified amount, are 
listed in the program under the heading "Final P Values." Eleven final probability values were 
calculated for the Graham (2002) study (see Figure 2). 

The third series of numbers consists of the differences between the final and initial 
probability values. These values, which represent the changes in the initial probability values 
associated with the specified change in the independent variable, are listed in the program under 
the heading "Change in P Values." These changes in the initial probability values ranged from a 
low of .027 to a high of .040 for the Graham (2002) study (see Figure 2). 

In addition to these three series of numbers, the second page of the program contains 
three individual probability values. The first of these values is the Delta-p statistic. This value is 
listed in the row entitled "Delta-p Statistic for 'p'," where the mean of the dependent variable is 
actually printed rather than the symbol p. As previously described, the Delta-p statistic indicates 
the change in the initial probability value, which is set equal to the mean of the dependent 
variable, associated with a specified change in the independent variable. The Delta-p statistic for 
the Graham (2002) study was .040 (see Figure 2). 

The second of the three individual values listed on the second page of the program is the 
initial probability at which the largest absolute change in the initial probability occurs for the 
specified change in the independent variable. This value is entered in the row entitled "P value at which 
the absolute change in P is the largest for the specified change in the independent variable." In 
the Graham (2002) study, the initial probability value at which the absolute change 
was the largest for the specified change of one year in the years working with the principal was 
.480 (see Figure 2). 

The last of the three individual values listed on the second page of the program is the
actual value of the maximum absolute change in the initial probability for the specified change in
the independent variable. This value, which occurs at the initial probability previously identified
by the program, is listed in the row entitled "Largest absolute change in P expressed as P." In the
Graham (2002) study the maximum change in the probability of a teacher belonging to the
late-stage-of-concern group associated with a one-year change in the number of years working with
the principal was .040, which, as previously noted, occurred at the initial probability of .480 (see
Figure 2). It should be noted that although this maximum value of .040 appears to equal the
Delta-p statistic, which was recorded for a different initial probability value, it would exceed the
Delta-p statistic if a sufficient number of decimal places were displayed.
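The relationship between the maximum change and the Delta-p statistic can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the closed-form expression for the peak, 1 / (1 + sqrt(k)) with k = exp(coefficient x change), is an algebraic simplification of the formula entered in cell F9 of the worksheet in Appendix A. A brute-force search over a fine grid is included as an independent check on that expression.

```python
import math


def change_in_probability(p0, coefficient, change=1.0):
    """Final probability minus initial probability after shifting the log odds."""
    log_odds = math.log(p0 / (1.0 - p0)) + coefficient * change
    return 1.0 / (1.0 + math.exp(-log_odds)) - p0


def peak_initial_probability(coefficient, change=1.0):
    """Initial probability at which the absolute change is largest (closed form)."""
    k = math.exp(coefficient * change)  # multiplier applied to the odds
    return 1.0 / (1.0 + math.sqrt(k))


b = 0.161  # Graham (2002) coefficient
p_star = peak_initial_probability(b)
largest = change_in_probability(p_star, b)
at_mean = change_in_probability(0.41, b)  # Delta-p at the mean of .41

# Independent check: search a fine grid of initial probabilities.
grid = [i / 10000 for i in range(1, 10000)]
p_grid = max(grid, key=lambda p: abs(change_in_probability(p, b)))

print(f"peak at {p_star:.3f} (grid search: {p_grid:.4f})")
print(f"largest change {largest:.5f} vs. Delta-p {at_mean:.5f}")
```

Printing five decimal places makes the two values distinguishable even when they look alike at the precision used in the text.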

The third page of the program contains a graph of the changes in the probability values
(Y axis) with their corresponding initial probability levels (X axis). The purpose of the graph is
to provide the researcher a visual depiction of this relationship. The graph generated for the
Graham (2002) study is presented in Figure 3.



Insert Figure 3 about here 



Assessing the Graham Study Information 

The statistically significant logistic regression coefficient for the independent variable representing
the number of years the teachers worked with the current principals, which was .161, indicated
that a one-year increase was associated with a .161 change in the log odds of belonging to the
late-stage-of-concern group. We believe that most practitioners would find that such an
interpretation provides little useful information with respect to judging the practical significance
of the coefficient. Although somewhat more useful, we believe that practitioners would also find
the corresponding odds value of 1.17 difficult to use when judging the practical significance of
the coefficient. We believe that practitioners will find that the changes in probability values
corresponding to a specified change in the independent variable, which are listed on the second
page of our computer program, do assist them in judging the practical significance of the
coefficient.
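The three ways of expressing the same coefficient can be placed side by side. The following sketch is an illustration under stated assumptions (the helper name `final_probability` is hypothetical; the coefficient of .161 is the Graham value): it converts the log odds coefficient to an odds multiplier and then shows that this single multiplier corresponds to different changes in probability depending on the initial probability.

```python
import math

coefficient = 0.161                 # change in log odds per additional year
odds_ratio = math.exp(coefficient)  # the corresponding odds value


def final_probability(p0, odds_multiplier):
    """Probability after the initial odds are multiplied by odds_multiplier."""
    odds = p0 / (1.0 - p0) * odds_multiplier
    return odds / (1.0 + odds)


for p0 in (0.20, 0.48, 0.70):
    change = final_probability(p0, odds_ratio) - p0
    print(f"initial p = {p0:.2f}: odds x {odds_ratio:.2f}, change in p = {change:.3f}")
```

The odds multiplier is constant across rows while the change in probability is not, which is the reason the method reports changes over a range of initial probabilities.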

A review of the values listed on the second page of the computer program for the Graham
(2002) study reveals that the changes in the initial probability values for a one-year change in the
teachers' years of working with their principal rose from a minimum of .027 at the initial
probability level of .20 to a maximum of .04 at the initial probability of .48 and then declined
to .033 at the initial probability of .70 (see Figure 2). The graph presented on the third
page of the program provides a visual aid depicting this relationship between the changes in the
initial probability values and their corresponding initial probability values (see Figure 3).

After reviewing these values, two conclusions can be made. First, the differences among 
the changes in the initial probability values associated with a one-year change were not 
substantial across the range of initial probability values. Thus the level of practical significance 
of the logistic regression coefficient can be accurately judged by using the single value of the 
Delta-p statistic, which was .04. Second, using the Delta-p value, it was concluded that the
educational significance of the change in moving teachers from the early-stage-of-concern group
to the late-stage-of-concern group was minimal. That is, the change in the probability is
estimated to be approximately .04.

One very important point must be considered, however, before a final conclusion is made 
regarding the practical significance of this statistically significant logistic regression coefficient. 
That is, before practitioners judge the practical significance of changes in the probability of 
belonging to the group assigned a value of one in the dependent variable, they may want to 
evaluate the changes associated with multiple unit changes in the independent variable (Norusis,
1999, pp. 39-40). This point is demonstrated by the changes in the probability levels
corresponding to changes in the number of years of working with the principal that exceed one
year (Graham, 2002). To illustrate this point, consider the differences in the changes in the
initial probability levels of teachers who have four additional years of experience working with
their principals, which is approximately one standard deviation unit of this independent variable.
To obtain these probability values, the researcher would specify the change in the independent
variable as 4 rather than one and enter this value next to the title "Change in the Independent
Variable" in the computer program.
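A multiple-unit change is applied on the log odds scale, so the resulting change in probability is not simply the one-unit change multiplied by the number of units. The sketch below illustrates this with the Graham coefficient of .161; the function name and the choice of the three initial probabilities are assumptions made for the example.

```python
import math


def change_in_probability(p0, coefficient, change):
    """Change in probability when the predictor changes by `change` units."""
    log_odds = math.log(p0 / (1.0 - p0)) + coefficient * change
    return 1.0 / (1.0 + math.exp(-log_odds)) - p0


coefficient = 0.161
rows = []
for p0 in (0.20, 0.42, 0.70):
    one_year = change_in_probability(p0, coefficient, 1)
    four_years = change_in_probability(p0, coefficient, 4)
    rows.append((p0, one_year, four_years))
    print(f"p0 = {p0:.2f}: one year {one_year:.3f}, "
          f"four years {four_years:.3f}, 4 x one year {4 * one_year:.3f}")
```

The last column is shown only for comparison; the four-year figure comes from entering 4 as the change, not from scaling the one-year result.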

The series of changes in the probability values generated by the program, which are listed
in Figure 4, indicate that the changes in the initial probability values for teachers with four
additional years of working with their principals compared to teachers without the additional four
years rose from a minimum of .123 at the initial probability level of .20 to a maximum of
.16 at the initial probability of .42 and then declined to .116 at the initial probability of .70.
These values suggest that although the increase in the probability that teachers would be
reclassified from the early-stage-of-concern group to the late-stage-of-concern group associated
with a one-year change in the years working with the principal may be judged to have minimal
educational importance, the difference between teachers with little and significant work experience
with the principal may be judged to have educational importance.



Insert Figure 4 about here

Summary

Program evaluators and educational researchers who use logistic regression to analyze 
dichotomous dependent variables are faced with the task of assisting practitioners in judging 
whether a statistically significant coefficient is also practically significant. The method and the
computer program presented in this paper are designed to provide program evaluators and
researchers with values that will assist them in assessing the practical significance of such a
coefficient. The program requires the researcher to enter six values: (a) a specified change in the
independent variable, (b) the logistic coefficient value, (c) the proportion of the subjects in the
sample assigned a value of one in the dependent variable, (d) a minimum initial probability
value, (e) a maximum initial probability value, and (f) an interval used to generate a series of
values between the minimum and maximum initial probability values.

Once these six values are entered into the program, it generates a series of changes in the
initial probabilities associated with a specified change in the independent variable. In addition to
this series of values, the program calculates three individual values. The first of these values is
the Delta-p statistic. This value indicates the change in the initial probability value, which is set
equal to the mean of the dependent variable, associated with a specified change in the
independent variable. The second individual value is the initial probability at which the largest
absolute change in the initial probability occurs. The third individual value represents the actual
value of the maximum absolute value of the change in the initial probability for a specified
change in the independent variable. In addition to these values, the program also provides a
graph that depicts the relationship between the changes in the probability values (Y axis) and
their corresponding initial probability levels (X axis).
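The six inputs and the values derived from them can be summarized in one short Python sketch. This is an assumption-laden illustration rather than a port of the spreadsheet: the `Inputs` class, the `summarize` function, and the dictionary keys are hypothetical names, and the number of steps is obtained by rounding instead of the small tolerance used in the worksheet formulas. The example values are those shown in Figure 1.

```python
import math
from dataclasses import dataclass


@dataclass
class Inputs:
    change: float       # (a) specified change in the independent variable
    coefficient: float  # (b) logistic regression coefficient
    proportion: float   # (c) proportion of cases coded one
    minimum: float      # (d) minimum initial probability
    maximum: float      # (e) maximum initial probability
    interval: float     # (f) spacing of the initial probabilities


def summarize(inputs):
    """Return the series of changes and the three individual values."""
    shift = inputs.coefficient * inputs.change

    def change_at(p0):
        log_odds = math.log(p0 / (1.0 - p0)) + shift
        return 1.0 / (1.0 + math.exp(-log_odds)) - p0

    steps = int(round((inputs.maximum - inputs.minimum) / inputs.interval))
    series = []
    for i in range(steps + 1):
        p0 = inputs.minimum + i * inputs.interval
        series.append((p0, change_at(p0)))

    # Initial probability with the largest absolute change (closed form).
    peak_p = 1.0 / (1.0 + math.sqrt(math.exp(shift)))
    return {
        "series": series,
        "delta_p": change_at(inputs.proportion),
        "peak_p": peak_p,
        "largest_change": change_at(peak_p),
    }


graham = Inputs(change=1, coefficient=0.161, proportion=0.41,
                minimum=0.20, maximum=0.70, interval=0.05)
result = summarize(graham)
for p0, change in result["series"]:
    print(f"{p0:.2f}  {p0 + change:.3f}  {change:.3f}")
```

Changing only the `change` field (for example, to 4) regenerates every value for a multiple-unit change.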
We believe that the values generated by the method and the computer program described
in this paper will provide researchers, program evaluators, and practitioners with information that
will assist them in judging the practical significance of a logistic regression coefficient in a
number of ways. First, practitioners should find the changes in probability values to be easier to
use when judging the practical significance of the coefficient as compared to the log odds value
(i.e., the logistic regression coefficient) or even the odds ratio value. Second, the series of
changes in the initial probability values associated with a specified change in the independent
variable will allow practitioners to judge whether the Delta-p value provides enough information
to judge the practical significance of the coefficient. Third, if a practitioner believes the practical
significance of the coefficient should be judged in terms of changes in the independent variable
that are greater than one unit, the program can easily generate the essential values that reflect
such a change.

The need to address the practical significance of statistically significant findings is an
important task for researchers, program evaluators, and practitioners to undertake. We believe
the implementation of the method through the use of the computer program, as described in this
paper, will assist researchers, program evaluators, and practitioners in the task of assessing the
practical significance of logistic regression coefficients.

Appendix A

Interested readers can obtain the worksheets used to calculate the various probability 
estimates for a given logistic regression coefficient as discussed in the previous sections in either 
of two ways. They can design the required worksheets by utilizing the Microsoft® Excel 
computer software in conjunction with the commands outlined in this section or they can request 
copies of the program from the authors. Send such requests to the following e-mail address: 

jfraas@ashland.edu. 

The commands used in conjunction with the Microsoft® Excel computer software, which 
are presented in this section, are divided into four sections. The first set of commands is 
designed to label the three worksheets included in the spreadsheet. These commands, which are 

printed in bold type, are as follows: 

1. Double-click on the tab for Sheet 1. 

2. Enter the name Parameters. 

3. Double-click on the tab for Sheet 2. 

4. Enter the name Calculations. 

5. Double-click on the tab for Sheet 3. 

6. Enter the name Chart. 

These commands name the three worksheets Parameters, Calculations, and Chart. 

The second set of commands is used to construct the Parameters worksheet. These 

commands are as follows: 

1. Click on the tab titled Parameters.

2. Enter in the specified cells the following text:

(a) Cell A1 -- The Calculation of the Changes in Initial Probability

(b) Cell A2 -- Values for Logistic Regression Coefficients

(c) Cell A3 -- Change in the Independent Variable:

(d) Cell A4 -- Coefficient Value:

(e) Cell A5 -- Proportion of Cases with a Value of 1:

(f) Cell A6 -- Minimum Value:

(g) Cell A7 -- Maximum Value:

(h) Cell A8 -- Interval:

(i) Cell C3 -- [the size of change in the independent variable]

(j) Cell C4 -- [the logistic regression coefficient for the variable]

(k) Cell C5 -- [average of the dependent variable]

(l) Cell C6 -- [the lowest predicted probability for the sample data]

(m) Cell C7 -- [the highest predicted probability for the sample data]

(n) Cell C8 -- [interval size between the min. and max. values]

3. Change the background of the range B3:B8 to indicate the location of the data to 

be entered. 

The third set of commands is used to construct the Calculations worksheet. These
commands are as follows:

1. Click on the tab titled Calculations.

2. Enter in the specified cells the following text:

(a) Cell A1 -- Initial P Values

(b) Cell B1 -- Final P Values

(c) Cell C1 -- Change in P Values

(d) Cell D4 -- Delta-P Statistic for

(e) Cell D7 -- P value at which the absolute change in P

(f) Cell D8 -- is the largest for the specified change in the

(g) Cell D9 -- independent variable

(h) Cell D11 -- Largest absolute change in P expressed as P

3. Enter in the specified cells the following formulas:

(a) Cell A2 -- =Parameters!B6

(b) Cell A3 -- =IF(A2=" "," ",IF(A2+Parameters!$B$8<(Parameters!$B$7+0.0001),A2+Parameters!$B$8," "))

(c) Cell B2 -- =A2+C2

(d) Cell B3 -- =IF(A3=" "," ",A3+C3)

(e) Cell C2 -- =IF(A2=" "," ",RAND()*0+(EXP((LN(A2/(1-A2))+(Parameters!$B$4*Parameters!$B$3))))/(1+EXP((LN(A2/(1-A2))+(Parameters!$B$4*Parameters!$B$3))))-A2)

(f) Cell C3 -- =IF(A3=" "," ",(EXP((LN(A3/(1-A3))+(Parameters!$B$4*Parameters!$B$3))))/(1+EXP((LN(A3/(1-A3))+(Parameters!$B$4*Parameters!$B$3))))-A3)

(g) Cell F4 -- =(EXP((LN(F2/(1-F2))+(Parameters!$B$4*Parameters!$B$3))))/(1+EXP((LN(F2/(1-F2))+(Parameters!$B$4*Parameters!$B$3))))-F2

(h) Cell F9 -- =(1-SQRT(EXP((Parameters!B4*Parameters!$B$3))))/(1-(EXP((Parameters!B4*Parameters!$B$3))))

(i) Cell F11 -- =(EXP((LN(F9/(1-F9))+(Parameters!$B$4*Parameters!$B$3))))/(1+EXP((LN(F9/(1-F9))+(Parameters!$B$4*Parameters!$B$3))))-F9
4. Copy the range A3:C3 and paste it into the range A4:C101.

5. Insert the following names:

(a) "change_in_p" refers to =OFFSET(Calculations!$B$2,0,0,COUNT(Calculations!$B$2:$B$101),1)

(b) "interval" refers to =OFFSET(Calculations!$A$2,0,0,COUNTA(Calculations!$A$2:$A$101),1)

6. Save the workbook with the name 'Fraas'.
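The worksheet formulas for the change in probability and for the initial probability at which that change is largest can be written more compactly. The derivation below is a sketch supplied for clarity and is not taken from the original text; the symbols b (cell B4 of the Parameters worksheet), d (cell B3), and k are introduced here for illustration.

```latex
\begin{align}
k &= e^{b\,d}, \qquad
\Delta(p) = \frac{k\,p}{1 + (k-1)\,p} - p
  && \text{(the change computed in column C)} \\
\frac{d\Delta}{dp} &= \frac{k}{\bigl(1 + (k-1)\,p\bigr)^{2}} - 1 = 0
  \;\Longrightarrow\; 1 + (k-1)\,p^{*} = \sqrt{k} \\
p^{*} &= \frac{\sqrt{k} - 1}{k - 1}
       = \frac{1 - \sqrt{k}}{1 - k}
       = \frac{1}{1 + \sqrt{k}}
  && \text{(the value computed in cell F9)}
\end{align}
```

The middle form of the last line is the one entered in cell F9. It is undefined when the product of the coefficient and the specified change is zero (k = 1), in which case the change in probability is zero at every initial probability.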

The fourth set of commands is used to construct the Chart worksheet. These commands
are as follows:

1. Click on the tab entitled Calculations

2. Turn off the legend by clicking on the Show Legend box

3. Click on Next>

4. Click on the button for 'As object in:'

5. Enter 'Chart'

6. Click on the tab titled Chart

7. Move cursor to the body of the chart and right click 

8. Click on ‘Source Data...’ 

9. In the Values: box enter =Fraas!change_in_p 

10. In the Category (X) axis labels box enter =Fraas!interval 

11. Click on OK 

12. Save the Workbook 
The Calculation of the Changes in Initial Probability
Values for a Logistic Regression Coefficient

Change in the Independent Variable:       1       [the size of change in the independent variable]
Coefficient Value:                        0.161   [the logistic regression coefficient for the variable]
Proportion of Cases with a Value of 1:    0.41    [average of the dependent variable]
Minimum Value:                            0.2     [the lowest predicted probability for the sample data]
Maximum Value:                            0.7     [the highest predicted probability for the sample data]
Interval:                                 0.05    [interval size between the min. and max. values]

Figure 1

Page 1 of the Computer Program that Contains the Results of the Graham Study with the Change
in the Independent Variable Set at One Year
Initial P Values    Final P Values    Change in P Values
0.20                0.227             0.027
0.25                0.281             0.031
0.30                0.335             0.035
0.35                0.387             0.037
0.40                0.439             0.039
0.45                0.490             0.040
0.50                0.540             0.040
0.55                0.589             0.039
0.60                0.638             0.038
0.65                0.686             0.036
0.70                0.733             0.033

Delta-p Statistic for 0.41:                                   0.039
P value at which the absolute change in P is the largest
for the specified change in the independent variable:         0.480
Largest absolute change in P expressed as P:                  0.040

Figure 2

Page 2 of the Computer Program that Contains the Results of the Graham Study with the Change
in the Independent Variable Set at One Year
Change in Initial Probability Values

[Graph of the changes in the probability values (Y axis) against the initial probability levels (X axis)]

Figure 3

Page 3 of the Computer Program that Contains the Results of the Graham Study with the Change
in the Independent Variable Set at One Year
Initial P Values    Final P Values    Change in P Values
0.20                0.323             0.123
0.25                0.388             0.138
0.30                0.449             [illegible]
0.35                0.506             [illegible]
0.40                0.559             0.159
0.45                0.609             [illegible]
0.50                0.656             0.156
0.55                0.699             0.149
0.60                0.741             [illegible]
0.65                0.780             [illegible]
0.70                0.816             0.116

Delta-p Statistic for 0.41:                                   0.160
P value at which the absolute change in P is the largest
for the specified change in the independent variable:         0.420
Largest absolute change in P expressed as P:                  0.160

Figure 4

Page 2 of the Computer Program that Contains the Results of the Graham Study with the Change
in the Independent Variable Set at Four Years